Abstract


This paper introduces a new Markov Chain Monte Carlo (MCMC) scheme for posterior sampling in
Bayesian nonparametric mixture models with priors that belong to the general Poisson-Kingman class.
We present a novel, compact way of representing the infinite-dimensional component of the model:
although this component is represented explicitly, the scheme has lower memory and storage
requirements than previous MCMC schemes. We describe comparative simulation results demonstrating
the efficacy of the proposed MCMC algorithm against existing marginal and conditional MCMC samplers.


1 Introduction 

According to Ghahramani [9], models that have a nonparametric component give us more flexibility,
which could lead to better predictive performance. This is because their capacity to learn does not
saturate, so their predictions should continue to improve as we get more and more data. Furthermore,
we are able to fully account for our uncertainty about predictions thanks to the Bayesian paradigm.
However, a major impediment to the widespread use of Bayesian nonparametric models is the problem of
inference. Over the years, many MCMC methods have been proposed to perform inference, and they
usually rely on a tailored representation of the underlying process. This is an active research area
since dealing with this infinite-dimensional component forbids the direct use of standard
simulation-based methods for posterior inference. These methods usually require a finite-dimensional
representation, and there are two main sampling approaches to facilitate simulation in the case of
Bayesian nonparametric models: random truncation and marginalization. These two schemes are known in
the literature as conditional and marginal samplers, respectively.

In conditional samplers, the infinite-dimensional prior is replaced by a finite-dimensional
representation chosen according to a truncation level. In marginal samplers, the need to represent
the infinite-dimensional component is bypassed by marginalising it out. Marginal samplers have lower
storage requirements than conditional samplers but could potentially have worse mixing properties.
However, not integrating out the infinite-dimensional component leads to a more comprehensive
representation of the random probability measure, which is useful for computing expectations of
interest with respect to the posterior.

In this paper, we propose a novel class of MCMC samplers for Poisson-Kingman mixture models, a very
large class of Bayesian nonparametric mixture models that encompasses all previously explored ones in
the literature. Our approach is based on a hybrid scheme that combines the main strengths of both
conditional and marginal samplers. In the flavour of probabilistic programming, we view our
contribution as a step towards wider usage of flexible Bayesian nonparametric models, as it allows
automated inference in probabilistic programs built out of a wide variety of Bayesian nonparametric
building blocks.


2 Poisson-Kingman processes 

Poisson-Kingman random probability measures (RPMs) were introduced in Pitman [23] as a
generalization of homogeneous Normalized Random Measures (NRMs). Let $\mathbb{X}$ be a complete and
separable metric space endowed with the Borel $\sigma$-field, and let $\mu \sim \mathrm{CRM}(\rho, H_0)$
be a homogeneous Completely Random Measure (CRM) with Lévy measure $\rho$ and base distribution $H_0$
on this space; see Kingman for a good overview of CRMs and references therein. The corresponding
total mass of $\mu$ is $T = \mu(\mathbb{X})$, which we assume to be finite, positive almost surely,
and absolutely continuous with respect to Lebesgue measure. For any $t \in \mathbb{R}^+$, consider the
conditional distribution of $\mu/t$ given that the total mass $T \in dt$. This distribution is denoted
by $\mathrm{PK}(\rho, \delta_t, H_0)$; it is the distribution of an RPM, where $\delta_t$ denotes the
usual Dirac delta function. Poisson-Kingman RPMs form a class of RPMs whose distributions are obtained
by mixing $\mathrm{PK}(\rho, \delta_t, H_0)$, over $t$, with respect to some distribution $\gamma$ on
the positive real line. Specifically, a Poisson-Kingman RPM has the following hierarchical
representation:

$$
\begin{aligned}
T &\sim \gamma \\
P \mid T = t &\sim \mathrm{PK}(\rho, \delta_t, H_0).
\end{aligned}
\tag{1}
$$

The RPM $P$ is referred to as the Poisson-Kingman RPM with Lévy measure $\rho$, base distribution
$H_0$ and mixing distribution $\gamma$. Throughout the paper we denote by $\mathrm{PK}(\rho, \gamma, H_0)$
the distribution of $P$ and, without loss of generality, we assume that
$\gamma(dt) \propto h(t) f_\rho(t)\,dt$, where $f_\rho$ is the density of the total mass $T$ under the
CRM and $h$ is a non-negative function. Note that when $\gamma(dt) = f_\rho(t)\,dt$, the distribution
$\mathrm{PK}(\rho, f_\rho, H_0)$ coincides with $\mathrm{NRM}(\rho, H_0)$. The resulting
$P = \sum_{k \ge 1} p_k \delta_{\phi_k}$ is almost surely discrete and, since $\mu$ is homogeneous, the
atoms $(\phi_k)_{k \ge 1}$ of $P$ are independent of their masses $(p_k)_{k \ge 1}$ and form a sequence
of independent random variables identically distributed according to $H_0$. Finally, the masses of $P$
have distribution governed by the Lévy measure $\rho$ and the distribution $\gamma$.

One nice property is that $P$ is almost surely discrete: if we obtain a sample $(Y_i)_{i=1}^n$ from
it, there is a positive probability of $Y_i = Y_j$ for each pair of indexes $i \ne j$. This induces a
random partition $\Pi$ on $\mathbb{N}$, where $i$ and $j$ are in the same block in $\Pi$ if and only
if $Y_i = Y_j$. Kingman showed that $\Pi$ is exchangeable; this property will be one of the main tools
for the derivation of our hybrid sampler.

2.1 Size-biased sampling Poisson-Kingman processes 

A second object induced by a Poisson-Kingman RPM is a size-biased permutation of its atoms.
Specifically, order the blocks in $\Pi$ by increasing order of the least element in each block, and
for each $k \in \mathbb{N}$ let $Z_k$ be the least element of the $k$th block. $Z_k$ is the index
among $(Y_i)_{i \ge 1}$ of the first appearance of the $k$th unique value in the sequence. Let
$J_k = \mu(\{Y_{Z_k}\})$ be the mass of the corresponding atom in $\mu$. Then $(J_k)_{k \ge 1}$ is a
size-biased permutation of the masses of atoms in $\mu$, with larger masses tending to appear earlier
in the sequence. It is easy to see that $\sum_{k \ge 1} J_k = T$, and that the sequence can be
understood as a stick-breaking construction: start with a stick of length $T_0 = T$; break off the
first piece of length $J_1$; the surplus length of stick is $T_1 = T_0 - J_1$; then the second piece
with length $J_2$ is broken off, etc.

Theorem 2.1 of Perman et al. states that the sequence of surplus masses $(T_k)_{k \ge 0}$ forms a
Markov chain and gives the corresponding initial distribution and transition kernels. The
corresponding generative process for the sequence $(Y_i)_{i \ge 1}$ is as follows:

i) Start with drawing the total mass from its distribution
   $\mathbb{P}_{\rho,h,H_0}(T \in dt) \propto h(t) f_\rho(t)\,dt$.

ii) The first draw $Y_1$ from $P$ is a size-biased pick from the masses of $\mu$. The actual value of
    $Y_1$ is simply $Y_1^* \sim H_0$, while the mass of the corresponding atom in $\mu$ is $J_1$, with
    conditional distribution

    $$\mathbb{P}_{\rho,h,H_0}(J_1 \in ds_1 \mid T \in dt) = \frac{s_1}{t}\,\rho(ds_1)\,\frac{f_\rho(t - s_1)}{f_\rho(t)},$$

    with surplus mass $T_1 = T - J_1$.

iii) For subsequent draws $i \ge 2$:

    - Let $K$ be the current number of distinct values among $Y_1, \ldots, Y_{i-1}$, and
      $Y_1^*, \ldots, Y_K^*$ the unique values, i.e., atoms in $\mu$. The masses of these first $K$
      atoms are denoted by $J_1, \ldots, J_K$ and the surplus mass is $T_K = T - \sum_{k=1}^K J_k$.

    - For each $k \le K$, with probability $J_k / T$, we set $Y_i = Y_k^*$.

    - With probability $T_K / T$, $Y_i$ takes on the value of an atom in $\mu$ besides the first $K$
      atoms. The actual value $Y_{K+1}^*$ is drawn from $H_0$, while its mass is drawn from

      $$\mathbb{P}_{\rho,h,H_0}(J_{K+1} \in ds_{K+1} \mid T_K \in dt_K) = \frac{s_{K+1}}{t_K}\,\rho(ds_{K+1})\,\frac{f_\rho(t_K - s_{K+1})}{f_\rho(t_K)}, \qquad T_{K+1} = T_K - J_{K+1}.$$
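
As an illustration of steps i) to iii), the following minimal sketch simulates the generative process in one special case where every conditional is available in closed form. It assumes a Gamma CRM, i.e. $\rho(dx) = \theta x^{-1} e^{-x} dx$ with $h \equiv 1$, so that $T \sim \mathrm{Gamma}(\theta, 1)$ and each new size-biased mass is a $\mathrm{Beta}(1, \theta)$ fraction of the current surplus. The function name and its arguments are hypothetical and are not part of the algorithms in this paper.

```python
import random


def size_biased_generative(n, theta, rng):
    """Simulate steps i)-iii) for a Gamma CRM (an assumed special case).

    Assumption: rho(dx) = theta * x**-1 * exp(-x) dx and h = 1, so that
    T ~ Gamma(theta, 1) and a new mass is surplus * Beta(1, theta).
    """
    total = rng.gammavariate(theta, 1.0)  # step i): total mass T
    masses = []                           # size-biased masses J_1, ..., J_K
    surplus = total                       # surplus mass T_K
    labels = []                           # cluster index of each draw Y_i
    for _ in range(n):
        u = rng.random() * total
        acc = 0.0
        chosen = None
        for k, mass in enumerate(masses):
            acc += mass
            if u < acc:                   # existing atom k, probability J_k / T
                chosen = k
                break
        if chosen is None:                # new atom, probability T_K / T
            new_mass = surplus * rng.betavariate(1.0, theta)
            masses.append(new_mass)
            surplus -= new_mass
            chosen = len(masses) - 1
        labels.append(chosen)
    return total, masses, surplus, labels
```

For other Lévy measures the Beta draw would have to be replaced by a draw from the corresponding conditional distribution of the new mass, which is in general not available in closed form.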


By multiplying the above infinitesimal probabilities one obtains the joint distribution of the random
elements $T$, $\Pi$, $(J_k)_{k \ge 1}$ and $(Y_k^*)_{k \ge 1}$:

$$
\mathbb{P}_{\rho,h,H_0}\big(\Pi_n = (c_k)_{k \in [K]},\ Y_k^* \in dy_k^*,\ J_k \in ds_k \text{ for } k \in [K],\ T \in dt\big)
= t^{-n} f_\rho\Big(t - \sum_{k=1}^{K} s_k\Big) h(t)\,dt \prod_{k=1}^{K} s_k^{|c_k|} \rho(ds_k) H_0(dy_k^*),
\tag{2}
$$

where $(c_k)_{k \in [K]}$ denotes a particular partition of $[n]$ with $K$ blocks, $c_1, \ldots, c_K$,
ordered by increasing least element, and $|c_k|$ is the cardinality of block $c_k$. The distribution
(2) is invariant to the size-biased order. Such a joint distribution was first obtained in Pitman
[23]; see also Pitman [24] for further details.

2.2 Relationship to the usual stick-breaking construction

The generative process above is reminiscent of the well-known stick-breaking construction from
Ishwaran & James, where one breaks a stick of length one, but it is not exactly the same. However, by
starting with Equation (2), we can recover the usual construction due to two useful identities in
distribution:

$$P_j \overset{d}{=} V_j \prod_{\ell < j} (1 - V_\ell) \quad \text{and} \quad V_j \overset{d}{=} \frac{J_j}{T - \sum_{\ell < j} J_\ell} \quad \text{for } j = 1, \ldots, K.$$

Indeed, we can reparameterize the model using these identities and then obtain the corresponding joint
in terms of $K$ $(0,1)$-valued stick-breaking weights $(V_j)_{j=1}^K$, which correspond to the usual
stick-breaking representation. Note that this joint distribution is for a general Lévy measure $\rho$
and density $f_\rho$, and it is conditioned on the value of the random variable $T$. Even so, we can
recover the standard stick-breaking representations for the Dirichlet and Pitman-Yor processes for a
specific choice of $\rho$ and if we integrate out $T$. However, in general, these stick-breaking
random variables form a sequence of dependent random variables with a complicated distribution, except
for the two previously mentioned processes; see Pitman for details.
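
The reparameterization can be made concrete with a small sketch. Assuming the standard stick-breaking relations, namely $V_j = J_j / (T - \sum_{\ell<j} J_\ell)$ for the stick proportions and $P_j = J_j / T$ for the normalized weights, the hypothetical helper below maps size-biased masses to both sets of weights.

```python
def to_stick_breaking(masses, total):
    """Map size-biased masses J_1..J_K with total mass T to (V_j, P_j).

    Hypothetical helper: V_j is the fraction of the remaining stick broken
    off at step j, and P_j = J_j / T is the normalized weight.
    """
    sticks, weights = [], []
    surplus = total
    for mass in masses:
        sticks.append(mass / surplus)  # V_j = J_j / (T - sum_{l<j} J_l)
        weights.append(mass / total)   # P_j = J_j / T
        surplus -= mass
    return sticks, weights


# Example: T = 4 with masses 2, 1, 0.5 gives V = (0.5, 0.5, 0.5)
# and P = (0.5, 0.25, 0.125), so that P_j = V_j * prod_{l<j} (1 - V_l).
sticks, weights = to_stick_breaking([2.0, 1.0, 0.5], 4.0)
```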
2.3 Poisson-Kingman mixture model


We are mainly interested in using Poisson-Kingman RPMs as a building block for an infinite mixture
model. Indeed, we can use Equation (1) as the top level of the following hierarchical specification:

$$
\begin{aligned}
T &\sim \gamma \\
P \mid T &\sim \mathrm{PK}(\rho, \delta_T, H_0) \\
Y_i \mid P &\sim P \\
X_i \mid Y_i &\sim F(\cdot \mid Y_i)
\end{aligned}
\tag{3}
$$

where $F(\cdot \mid Y)$ is the likelihood term for each mixture component, and our dataset consists of
$n$ observations $(x_i)_{i \in [n]}$ of the corresponding variables $(X_i)_{i \in [n]}$. We will assume
that $F(\cdot \mid Y)$ is smooth. After specifying the model, we would like to carry out inference for
clustering and/or density estimation tasks. With our novel approach, we can do this exactly and more
efficiently than with known MCMC samplers. In the next section, we present our main contribution, and
in the following one we show how it outperforms other samplers.

3 Hybrid Sampler 

The joint distribution in Equation (2) is written in terms of the first $K$ size-biased weights. In
order to obtain a complete representation of the RPM, we would need to size-bias sample from it a
countably infinite number of times. Consequently, some way of representing this object exactly in a
computer with finite memory and storage is needed.

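We introduce the following novel strategy: starting from Equation (2), we exploit the generative
process of Section 2.1 when reassigning observations to clusters. In addition to this, we
reparameterize the model in terms of a surplus mass random variable $V = T - \sum_{k=1}^{K} J_k$ and
end up with the following joint distribution:

$$
\begin{aligned}
&\mathbb{P}_{\rho,h,H_0}\Big(\Pi_n = (c_k)_{k \in [K]},\ Y_k^* \in dy_k^*,\ J_k \in ds_k \text{ for } k \in [K],\ T - \sum_{k=1}^{K} J_k \in dv,\ X_i \in dx_i \text{ for } i \in [n]\Big) \\
&\qquad = \Big(v + \sum_{k=1}^{K} s_k\Big)^{-n} h\Big(v + \sum_{k=1}^{K} s_k\Big) f_\rho(v)\,dv \prod_{k=1}^{K} s_k^{|c_k|} \rho(ds_k) H_0(dy_k^*) \prod_{i \in c_k} F(dx_i \mid y_k^*).
\end{aligned}
\tag{4}
$$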

For this reason, while having a complete representation of the infinite-dimensional part of the model,
we only need to explicitly represent those size-biased weights associated with occupied clusters plus
a surplus mass term, which is associated with the rest of the empty clusters, as Figure 1 shows. The
cluster reassignment step can be seen as a lazy sampling scheme since we explicitly represent and
update the weights associated with occupied clusters and create a size-biased weight only when a new
cluster appears. To make this possible we use the induced partition, and we call Equation (4) the
varying table size Chinese restaurant representation because the size-biased weights can be thought of
as the sizes of the tables in our restaurant. In the next subsection, we compute the complete
conditionals of each random variable of interest to implement an overall Gibbs sampling MCMC scheme.
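
The bookkeeping behind this lazy representation can be sketched as a small data structure that stores only the occupied table sizes and a single surplus mass. The class and method names below are hypothetical; the sketch only illustrates that creating or discarding a table moves mass between the table and the surplus, leaving the total mass unchanged.

```python
class VaryingTableRestaurant:
    """Occupied table sizes plus one surplus-mass term (hypothetical helper)."""

    def __init__(self, surplus):
        self.sizes = {}          # table id -> size-biased weight of an occupied cluster
        self.surplus = surplus   # mass shared by all unrepresented (empty) clusters
        self._next_id = 0

    def total_mass(self):
        return self.surplus + sum(self.sizes.values())

    def add_table(self, new_size):
        """Carve a new table out of the surplus; requires 0 < new_size < surplus."""
        if not 0.0 < new_size < self.surplus:
            raise ValueError("new table size must lie in (0, surplus)")
        table_id = self._next_id
        self._next_id += 1
        self.sizes[table_id] = new_size
        self.surplus -= new_size
        return table_id

    def remove_table(self, table_id):
        """Return the size of an emptied table to the surplus."""
        self.surplus += self.sizes.pop(table_id)
```

Memory use grows with the number of occupied clusters only, which is the point of the representation: no truncation level and no weights for empty clusters are stored.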

3.1 Complete conditionals 

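Starting from Equation (4), we obtain the following complete conditionals for the Gibbs sampler:

$$
\mathbb{P}(V \in dv \mid \text{Rest}) \propto \Big(v + \sum_{k=1}^{K} s_k\Big)^{-n} f_\rho(v)\, h\Big(v + \sum_{k=1}^{K} s_k\Big)\,dv
$$

$$
\mathbb{P}(J_i \in ds_i \mid \text{Rest}) \propto \Big(v + s_i + \sum_{k \ne i} s_k\Big)^{-n} h\Big(v + s_i + \sum_{k \ne i} s_k\Big)\, s_i^{|c_i|} \rho(ds_i)\, \mathbb{I}_{(0,\,\text{Surpmass}_i)}(s_i)
$$

where $\text{Surpmass}_i = V + \sum_{j=1}^{K} J_j - \sum_{j < i} J_j$, and

$$
\mathbb{P}(c_i = c \mid c_{-i}, \text{Rest}) \propto
\begin{cases}
s_c\, F(dx_i \mid \{X_j\}_{j \in c}, Y_c^*) & \text{if } i \text{ is assigned to existing cluster } c \\
\frac{v}{M}\, F(dx_i \mid Y_c^*) & \text{if } i \text{ is assigned to a new cluster } c.
\end{cases}
\tag{5}
$$

Figure 1: Varying table size Chinese restaurant representation for observations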

According to the rule above, the $i$th observation will be either reassigned to an existing cluster or
to one of the $M$ new clusters in the ReUse algorithm as in Favaro & Teh. If it is assigned to a new
cluster, then we need to sample a new size-biased weight from the following:

$$
\mathbb{P}(J_{K+1} \in ds_{K+1} \mid \text{Rest}) \propto f_\rho(v - s_{K+1})\, s_{K+1}\, \rho(ds_{K+1})\, \mathbb{I}_{(0,v)}(s_{K+1}).
\tag{6}
$$

Every time a new cluster is created we need to obtain its corresponding size-biased weight, which
could happen $1 \le R \le n$ times per iteration; hence, it has a significant contribution to the
overall computational cost. For this reason, an independent and identically distributed (i.i.d.) draw
from its corresponding complete conditional is highly desirable. In the next subsection we present a
way to achieve this. Finally, for updating the cluster parameters $(Y_k^*)_{k \in [K]}$ where $H_0$ is
non-conjugate to the likelihood, we use an extension of Favaro & Teh's ReUse algorithm; see
Algorithm 2 in the supplementary material for details.
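
A minimal sketch of the reassignment probabilities follows. It assumes, as we read the rule in this subsection, that an existing cluster $c$ receives weight $s_c$ times its likelihood term and that each of the $M$ candidate new clusters receives weight $v/M$ times its likelihood term. The function name and argument layout are hypothetical, and the log-likelihood terms are taken as given inputs.

```python
import math


def assignment_probabilities(sizes, surplus, num_new, loglik_existing, loglik_new):
    """Normalized probabilities over existing clusters, then M candidate new ones.

    sizes:            size-biased weights s_c of the occupied clusters
    surplus:          surplus mass v
    num_new:          number M of candidate new clusters
    loglik_existing:  log-likelihood of x_i under each occupied cluster
    loglik_new:       log-likelihood of x_i under each candidate new cluster
    """
    log_w = [math.log(s) + ll for s, ll in zip(sizes, loglik_existing)]
    log_w += [math.log(surplus / num_new) + ll for ll in loglik_new]
    top = max(log_w)                        # subtract the max for numerical stability
    w = [math.exp(lw - top) for lw in log_w]
    norm = sum(w)
    return [x / norm for x in w]
```

With equal likelihood terms, sizes $(1, 2)$, surplus $v = 1$ and $M = 2$, the unnormalized weights are $(1, 2, 0.5, 0.5)$, so the total probability of opening a new cluster is $v / (v + \sum_c s_c) = 0.25$.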

The complete conditionals in Equation (5) do not have a standard form, but a generic MCMC method can
be applied to sample from each within the Gibbs sampler. We use slice sampling from Neal [19] to
update the size-biased weights and the surplus mass. However, there is a class of priors where the
total mass's density is intractable, so an additional step needs to be introduced to sample the
surplus mass. In the next subsection we present two alternative ways to overcome this issue.
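
For concreteness, the following is a minimal sketch of one univariate slice-sampling update on a bounded interval, using the shrinkage procedure with the whole support as the initial interval. It is a generic illustration under our own assumptions (bounded support, a user-supplied log-density returning minus infinity outside the support), not the exact implementation used in the experiments. A bounded interval matches the size-biased weights, whose support is $(0, \text{Surpmass}_i)$; the surplus mass lives on the positive half line and would need a stepping-out variant or a transformation.

```python
import math
import random


def slice_sample_bounded(log_density, x0, lower, upper, rng):
    """One slice-sampling update on (lower, upper) with shrinkage.

    log_density must be finite at x0 and return -inf outside the support.
    """
    # Draw the log of a uniform height under the (unnormalized) density at x0.
    log_y = log_density(x0) + math.log(1.0 - rng.random())
    left, right = lower, upper
    while True:
        x1 = left + (right - left) * rng.random()
        if log_density(x1) >= log_y:
            return x1
        # Shrink the interval towards the current point and retry.
        if x1 < x0:
            left = x1
        else:
            right = x1


# Usage: target density proportional to x on (0, 1), i.e. Beta(2, 1).
rng = random.Random(1)
log_d = lambda x: math.log(x) if x > 0.0 else float("-inf")
x = 0.5
for _ in range(100):
    x = slice_sample_bounded(log_d, x, 0.0, 1.0, rng)
```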


3.2 Examples of classes of Poisson-Kingman priors

a) $\sigma$-Stable Poisson-Kingman processes [23]. For any $\sigma \in (0,1)$, let

$$f_\sigma(t) = \frac{1}{\pi} \sum_{j=0}^{\infty} \frac{(-1)^{j+1}}{j!} \sin(\pi \sigma j)\, \frac{\Gamma(\sigma j + 1)}{t^{\sigma j + 1}}$$

be the density function of a positive $\sigma$-Stable random variable and
$\rho(dx) = \rho_\sigma(dx) := \frac{\sigma}{\Gamma(1-\sigma)} x^{-\sigma-1} dx$. This class of RPMs is
denoted by $\mathrm{PK}(\rho_\sigma, h_T, H_0)$, where $h$ is a function that indexes each member of
the class. For example, in the experimental section, we picked three choices of the $h$ function that
index the following processes: Pitman-Yor, Normalized Stable and Normalized Generalized Gamma
processes. This class includes all Gibbs type priors with parameter $\sigma \in (0,1)$, so other
choices of $h$ are possible; see Gnedin & Pitman [10] and De Blasi et al. [1] for a noteworthy account
of this class of Bayesian nonparametric priors. In this case, the total mass's density is intractable
and we propose two ways of dealing with this. Firstly, we use Kanter's integral representation for the
$\sigma$-Stable density as in Lomeli et al., introduce an auxiliary variable $Z$ and slice sample each
variable:

$$
\mathbb{P}(V \in dv \mid \text{Rest}) \propto \Big(v + \sum_{i=1}^{K} s_i\Big)^{-n} v^{-\frac{1}{1-\sigma}} \exp\Big[-v^{-\frac{\sigma}{1-\sigma}} A(z)\Big]\, h\Big(v + \sum_{i=1}^{K} s_i\Big)\,dv
$$

$$
\mathbb{P}(Z \in dz \mid \text{Rest}) \propto A(z) \exp\Big[-v^{-\frac{\sigma}{1-\sigma}} A(z)\Big]\,dz,
$$

see Algorithm 1 in the supplementary material for details. Alternatively, we can completely bypass the
evaluation of the total mass's density by updating the surplus mass with a Metropolis-Hastings step
with an independent proposal from a Stable or from an Exponentially Tilted Stable($\lambda$). It is
straightforward to obtain i.i.d. draws from these proposals; see Devroye and Hofert [11] for an
improved rejection sampling method for the exponentially tilted case. This leads to the following
acceptance ratio:


$$
\frac{\mathbb{P}(V \in dv' \mid \text{Rest})\, f_\sigma(v) \exp(-\lambda v)}{\mathbb{P}(V \in dv \mid \text{Rest})\, f_\sigma(v') \exp(-\lambda v')}
= \frac{\big(v' + \sum_{i=1}^{K} s_i\big)^{-n} h\big(v' + \sum_{i=1}^{K} s_i\big)\,dv' \exp(-\lambda v)}{\big(v + \sum_{i=1}^{K} s_i\big)^{-n} h\big(v + \sum_{i=1}^{K} s_i\big)\,dv \exp(-\lambda v')},
$$

see Algorithm 4 in the supplementary material for details. Finally, to sample a new size-biased
weight:

$$
\mathbb{P}(J_{K+1} \in ds_{K+1} \mid \text{Rest}) \propto f_\sigma(v - s_{K+1})\, s_{K+1}^{-\sigma}\, \mathbb{I}_{(0,v)}(s_{K+1})\,ds_{K+1}.
\tag{7}
$$

Fortunately, we can get an i.i.d. draw from the above due to an identity in distribution given by
Favaro et al. for the usual stick-breaking weights, for any prior in this class such that
$\sigma = \frac{u}{v}$ where $u < v$ are coprime integers. Then we just reparameterize it back to
obtain the new size-biased weight; see Algorithm 3 in the supplementary material for details.
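
The Metropolis-Hastings alternative for the surplus mass can be sketched as follows. Because the independence proposal has density proportional to $f_\sigma(v) \exp(-\lambda v)$, the intractable factor $f_\sigma$ cancels against the one in the target, and only $(v + \sum_i s_i)^{-n} h(v + \sum_i s_i)$ and the tilting term remain. The function below is a hypothetical helper that computes the log acceptance ratio under these assumptions; `log_h` stands for the logarithm of the user-chosen function $h$.

```python
import math


def log_mh_ratio(v_new, v_old, sizes, n, lam, log_h):
    """Log acceptance ratio for an independence proposal with density
    proportional to f_sigma(v) * exp(-lam * v); f_sigma cancels out.

    sizes: size-biased weights of occupied clusters, n: number of observations.
    """
    total_sizes = sum(sizes)

    def log_target_over_f(v):
        t = v + total_sizes
        return -n * math.log(t) + log_h(t)

    return (log_target_over_f(v_new) - lam * v_old) - (
        log_target_over_f(v_old) - lam * v_new
    )


# Example with h = 1 and no tilting: n = 2, sum of sizes = 1, v = 1, v' = 3
# gives a ratio of (4 / 2) ** -2 = 0.25.
ratio = math.exp(log_mh_ratio(3.0, 1.0, [0.5, 0.5], 2, 0.0, lambda t: 0.0))
```

The proposal $v'$ would then be accepted with probability $\min(1, \text{ratio})$.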


b) $-\log$Beta-Poisson-Kingman processes. Let
$f_\rho(t) = \frac{\Gamma(a+b)}{\Gamma(a)\Gamma(b)} \exp(-at)\,(1 - \exp(-t))^{b-1}$
be the density of a positive random variable $X \overset{d}{=} -\log Y$, where
$Y \sim \mathrm{Beta}(a, b)$, and
$\rho(x) = \frac{\exp(-ax)\,(1 - \exp(-bx))}{x\,(1 - \exp(-x))}$.
This class of RPMs generalises the Gamma process but has similar properties. Indeed, if we take
$b = 1$ and the density function for $T$ is $\gamma(t) = f_\rho(t)$, we recover the Lévy measure and
total mass's density function of a Gamma process. Finally, to sample a new size-biased weight:

$$
\mathbb{P}(J_{K+1} \in ds_{K+1} \mid \text{Rest}) \propto \big(1 - \exp(s_{K+1} - v)\big)^{b-1}\, \frac{1 - \exp(-b\, s_{K+1})}{1 - \exp(-s_{K+1})}\, \mathbb{I}_{(0,v)}(s_{K+1})\,ds_{K+1}.
$$

If $b > 1$, this complete conditional is a monotone decreasing unnormalised density with maximum at
$b$. We can easily get an i.i.d. draw with a simple rejection sampler where the rejection constant is
$bv$ and the proposal is $\mathcal{U}(0, v)$. There is no other known sampler for this process.
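
The rejection sampler described above can be sketched in a few lines. The sketch assumes the unnormalised target $g(s) = (1 - e^{s - v})^{b-1} (1 - e^{-bs}) / (1 - e^{-s})$ on $(0, v)$, which follows from combining the size-biased conditional with the $-\log$Beta Lévy measure and total mass density, and it uses the bound $g(s) \le b$ for $b > 1$. The function name is hypothetical.

```python
import math
import random


def sample_new_weight_logbeta(v, b, rng):
    """Rejection sampler on (0, v) with a Uniform(0, v) proposal (assumes b > 1).

    Accepts a proposal s with probability g(s) / b, where g is the
    unnormalised target density and b bounds it from above.
    """
    while True:
        s = v * rng.random()
        if s == 0.0:
            continue  # the target is only defined on the open interval
        target = (
            (1.0 - math.exp(s - v)) ** (b - 1.0)
            * (-math.expm1(-b * s))
            / (-math.expm1(-s))
        )
        if rng.random() * b <= target:
            return s
```

Since the proposal density is $1/v$, bounding $g$ by $b$ corresponds to the rejection constant $bv$ quoted in the text.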


3.3 Relationship to marginal and conditional MCMC samplers 

Starting from Equation (2), another strategy would be to reparameterize the model in terms of the
usual stick-breaking weights. Next, we could choose a random truncation level and represent finitely
many sticks as in Favaro & Walker. Alternatively, we could integrate out the random probability
measure and sample only the partition induced by it as in Lomeli et al. Conditional samplers have
large memory requirements since the number of sticks needed can often be very large. Furthermore, the
conditional distributions of the stick lengths are quite involved, so these samplers tend to have slow
running times. Marginal samplers have lower storage requirements than conditional samplers but could
potentially have worse mixing properties. For example, Lomeli et al. had to introduce a number of
auxiliary variables, which worsens the mixing.
Algorithm               σ      Running time          ESS (±std)
----------------------------------------------------------------------
Pitman-Yor process (θ = 10)
Hybrid                  0.3    7135.1 (28.316)       2635.488 (187.335)
Hybrid-MH (λ = 0)       0.3    5469.4 (186.066)      2015.625 (152.030)
Conditional             0.3    NA                    NA
Marginal                0.3    4685.7 (84.104)       2382.799 (169.359)
Hybrid                  0.5    3246.9 (24.894)       3595.508 (174.075)
Hybrid-MH (λ = 50)      0.5    4902.3 (6.936)        3579.686 (135.726)
Conditional             0.5    10141.6 (237.735)     905.444 (41.475)
Marginal                0.5    4757.2 (37.077)       2944.065 (195.011)
----------------------------------------------------------------------
Normalized Stable process
Hybrid                  0.3    5054.7 (70.675)       5324.146 (167.843)
Hybrid-MH (λ = 0)       0.3    7866.4 (803.228)      5074.909 (100.300)
Conditional             0.3    NA                    NA
Marginal                0.3    7658.3 (193.773)      2630.264 (429.877)
Hybrid                  0.5    5382.9 (57.561)       4877.378 (469.794)
Hybrid-MH (λ = 50)      0.5    4537.2 (37.292)       4454.999 (348.356)
Conditional             0.5    10033.1 (22.647)      912.382 (167.089)
Marginal                0.5    8203.1 (106.798)      3139.412 (351.788)
----------------------------------------------------------------------
Normalized Generalized Gamma process (τ = 1)
Hybrid                  0.3    4157.8 (92.863)       5104.713 (200.949)
Hybrid-MH (λ = 0)       0.3    4745.5 (187.506)      4848.560 (312.820)
Conditional             0.3    NA                    NA
Marginal                0.3    7685.8 (208.98)       3587.733 (569.984)
Hybrid                  0.5    6299.2 (102.853)      4646.987 (370.955)
Hybrid-MH (λ = 50)      0.5    4686.4 (35.661)       4343.555 (173.113)
Conditional             0.5    10046.9 (206.538)     1000.214 (70.148)
Marginal                0.5    8055.6 (93.164)       4443.905 (367.297)
----------------------------------------------------------------------
-logBeta (a = 1, b = 2)
Hybrid                  -      2520.6 (121.044)      3068.174 (540.111)
Conditional             -      NA                    NA
Marginal                -      NA                    NA

Table 1: Running times in seconds and ESS averaged over 10 chains, 30,000 iterations, 10,000 burn-in.


Our novel hybrid sampler exploits the advantages of both marginal and conditional samplers. It has
lower memory requirements since it only represents the size-biased weights of occupied clusters, as
opposed to conditional samplers, which represent both empty and occupied clusters. Also, it does not
integrate out the size-biased weights; thus, we obtain a more comprehensive representation of the RPM.


4 Performance assessment


We illustrate the performance of our hybrid sampler on a range of Bayesian nonparametric mixture
models, obtained by different specifications of $\rho$ and $\gamma$, as in Equation (3). At the top
level of this hierarchical specification, different Bayesian nonparametric priors were chosen from
both classes presented in the examples section. We chose the base distribution $H_0$ and the
likelihood term $F$ for the $k$th cluster to be

$$
H_0(d\mu_k) = \mathcal{N}(d\mu_k \mid \mu_0, \sigma_0^2) \quad \text{and} \quad F(dx_1, \ldots, dx_{n_k} \mid \mu_k, \sigma_1) = \prod_{i=1}^{n_k} \mathcal{N}(x_i \mid \mu_k, \sigma_1^2),
$$

where $X_1, \ldots, X_{n_k}$ are the $n_k$ observations assigned to the $k$th cluster at some
iteration. $\mathcal{N}$ denotes a Normal distribution with mean $\mu_k$ and variance $\sigma_1^2$, a
common parameter among all clusters. The mean's prior distribution is Normal, centered at $\mu_0$ and
with variance $\sigma_0^2$. Although the base distribution is conjugate to the likelihood, we treated
it as a non-conjugate case and sampled the parameters at each iteration rather than integrating them
out.
We used the dataset from Roeder [26] to test the algorithmic performance in terms of running time and
effective sample size (ESS), as Table 1 shows. The dataset consists of measurements of velocities in
km/sec of $n = 82$ galaxies from a survey of the Corona Borealis region. For the $\sigma$-Stable
Poisson-Kingman class, we compared our sampler against a variation of Favaro & Walker's conditional
sampler and against the marginal sampler of Lomeli et al. We chose to compare our hybrid sampler
against these existing approaches because they follow the same general purpose paradigm.

Table 1 shows that different choices of $\sigma$ result in differences in the algorithm's running
times and ESS. The reason for this is that in the $\sigma = 0.5$ case there are readily available
random number generators which do not increase the computational cost. In contrast, in the
$\sigma = 0.3$ case, a rejection sampler is needed every time a new size-biased weight is sampled,
which increases the computational cost; see Favaro et al. for details. Even so, in most cases, we
outperform both marginal and conditional MCMC schemes in terms of running times and, in all cases, in
terms of ESS. In the Hybrid-MH case, even though the ESS and running times are competitive, we found
that the acceptance rate is not optimal; we are currently exploring other choices of proposals.
Finally, in Example b), our approach is the only one available and it has good running times and ESS.
This qualitative comparison confirms our previous statements about our novel approach.
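
For readers unfamiliar with ESS, the following sketch shows one common estimator: the chain length divided by the integrated autocorrelation time, with the sum of autocorrelations truncated at the first non-positive lag. This is an assumption made for illustration only; the text does not state which ESS estimator produced the numbers in Table 1, and the function name is hypothetical.

```python
import random


def effective_sample_size(chain):
    """ESS = n / (1 + 2 * sum of autocorrelations), truncated at the
    first non-positive lag (one common estimator among several)."""
    n = len(chain)
    mean = sum(chain) / n
    centred = [x - mean for x in chain]
    var = sum(c * c for c in centred) / n
    if var == 0.0:
        return float(n)
    tau = 1.0
    for lag in range(1, n):
        rho = sum(centred[i] * centred[i + lag] for i in range(n - lag)) / (n * var)
        if rho <= 0.0:
            break
        tau += 2.0 * rho
    return n / tau


# Usage: a strongly autocorrelated AR(1) chain has a much smaller ESS
# than an i.i.d. chain of the same length.
rng = random.Random(0)
iid_chain = [rng.gauss(0.0, 1.0) for _ in range(2000)]
ar_chain = [0.0]
for _ in range(1999):
    ar_chain.append(0.9 * ar_chain[-1] + rng.gauss(0.0, 1.0))
```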


5 Discussion 

Our main contribution is our hybrid MCMC sampler as a general purpose tool for inference with a very
large class of infinite mixture models. We argue in favour of an approach in which a generic algorithm
can be applied to a very large class of models, so that the modeller has a lot of flexibility in
choosing specific models suitable for their problem of interest. Our method is a hybrid approach since
it combines the perks of the conditional and marginal schemes. Indeed, our experiments confirm that
our hybrid sampler is more efficient since it outperforms both marginal and conditional samplers in
running times in most cases and in ESS in all cases.

We introduced a new compact way of representing the infinite-dimensional component of the model such
that it is feasible to perform inference and deal with the corresponding intractabilities. However,
there are still various intractabilities and challenges that remain when dealing with this type of
model. For example, we would like to stress that there are some values of $\sigma$ for which we are
unable to perform inference with our novel sampler. Furthermore, when a Metropolis-Hastings step is
chosen, there could be other ways to improve the mixing in terms of better proposals. We consider
these points to be an interesting avenue of future research.
A Pseudocode


Algorithm 1 HybridSampler-$\sigma$PK($k$, $V$, $c$, $\{J_c\}_{c \in \Pi_n}$, $\{Y_c^*\}_{c \in \Pi_n}$, $H_0$, $M$)

  for $t = 2 \to$ iter do
    Update $v^{(t)}$: Slice sample $\mathbb{P}(V \in dv \mid \text{rest})$
    Update $s_i^{(t)}$ for $i = 1, \ldots, k$: Slice sample $\mathbb{P}(J_i \in ds_i \mid \text{rest})$
    Update $\pi^{(t)}$: AddTable&ReUse($v$, $\Pi_n$, $M$, $\{Y_c^*\}_{c \in \Pi_n}$, $\{J_c\}_{c \in \Pi_n}$, $H_0 \mid \text{rest}$)
  end for


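Algorithm 2 AddTable&ReUse($V$, $\Pi_n$, $M$, $\{Y_c^*\}_{c \in \Pi_n}$, $\{J_c\}_{c \in \Pi_n}$, $H_0 \mid \text{rest}$)

  Let $c \in \Pi_n$ be such that $i \in c$
  $c \leftarrow c \setminus \{i\}$
  if $c = \emptyset$ then
    $k \sim \text{UniformDiscrete}(\frac{1}{M})$
    $\Pi_n \leftarrow \Pi_n \setminus \{c\}$
    $V \leftarrow v + J_c$                      ▷ Add back the discarded table size to the surplus
  end if
  Set $c'$ according to $\mathbb{P}(c_i = c' \mid c_{-i}, \text{Rest}) \propto s_{c'}\, F(dx_i \mid \{X_j\}_{j \in c'}, Y_{c'}^*)$ if existing,
      and $\propto \frac{v}{M}\, F(dx_i \mid Y_{c'})$ if new
  if $c' \in [M]$ then
    $J_{new} \leftarrow$ ExactSampleNewTableSize($\{J_c\}_{c \in \pi}$, $\Pi_n = \pi$, Rest)
    $V \leftarrow v - J_{new}$                  ▷ Remove it from the old surplus
    $\Pi_n \leftarrow \Pi_n \cup \{\{i\}\}$
    $\{J_c\}_{c \in \Pi_n} \leftarrow \{J_c\}_{c \in \Pi_n} \cup \{J_{new}\}$
    $Y_{c'} \sim H_0$                           ▷ Draw from $H_0$
  else
    $c' \leftarrow c' \cup \{i\}$
  end if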
Algorithm 3 ExactSampleNewTableSize($V$, $\sigma$, Rest)

  if $\sigma = 0.5$ then
    $G \sim \text{Gamma}(\frac{3}{4}, 1)$
    $IG \sim \text{Inverse Gamma}(\frac{1}{4}, \frac{1}{4^3 v^2})$
    $V_{stick} = \frac{\sqrt{G}}{\sqrt{G} + \sqrt{IG}}$
    $J_{new} = V_{stick}\, v$
  else if $\sigma < 0.5$ and $\sigma = \frac{u}{v}$ with $u, v \in \mathbb{N}$ then
    $L_\sigma$ …
    $\lambda$ = …
    $IG \sim \text{Inverse Gamma}(\ldots)$
    $V_{stick} = \frac{G}{G + IG}$
    $J_{new} = V_{stick}\, v$
  end if

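Algorithm 4 HybridSampler-MH-$\sigma$PK($K$, $S$, $V$, $c$, $\theta$, $Y$, $X$, $M$)

  for $t = 2 \to$ iter do
    Update $s_i^{(t)}$ for $i = 1, \ldots, k$: Slice sample $\mathbb{P}(J_i \in ds_i \mid \text{rest})$
    Update $v^{(t)}$: MH step for $\mathbb{P}(V \in dv \mid \text{rest})$ with independent proposal
        Stablernd($\sigma$) or ExpTiltStablernd($\lambda$, $\sigma$)
    Update $\pi^{(t)}$: AddTable&ReUse($v$, $\Pi_n$, $M$, $\{Y_c^*\}_{c \in \Pi_n}$, $\{J_c\}_{c \in \Pi_n}$, $H_0 \mid \text{rest}$)
  end for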